COFFIN: A Computational Framework for Linear SVMs

Authors

  • Sören Sonnenburg
  • Vojtech Franc
Abstract

In a variety of applications, kernel machines such as Support Vector Machines (SVMs) have been used with great success, often delivering state-of-the-art results. Using the kernel trick, they work on several domains and even enable heterogeneous data fusion by concatenating feature spaces or by multiple kernel learning. Unfortunately, they are not suited for truly large-scale applications since they suffer from the curse of supporting vectors, i.e., the speed of applying SVMs decays linearly with the number of support vectors. In this paper we develop COFFIN, a new training strategy for linear SVMs that effectively allows the use of on-demand computed kernel feature spaces and virtual examples in the primal. With linear training and prediction effort, this framework extends SVM applications to truly large-scale problems: as an example, we train SVMs for human splice site recognition involving 50 million examples and sophisticated string kernels. Additionally, we learn an SVM-based gender detector on 5 million examples on low-tech hardware and achieve accuracies beyond the state of the art on both tasks. Source code, data sets and scripts are freely available from http://sonnenburgs.de/soeren/coffin.
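The cost argument in the abstract can be illustrated with a small sketch. It is not the COFFIN implementation; it only shows, for a linear kernel (or any explicitly computed feature map), why working in the primal removes the dependence on the number of support vectors at prediction time. All names (`kernel_predict`, `collapse_to_primal`, `primal_predict`) and the random data are hypothetical and chosen for illustration.

```python
import random

def kernel_predict(support_vectors, coeffs, bias, x):
    """Dual-form score: bias + sum_i coeffs[i] * <sv_i, x>.
    The work grows linearly with the number of support vectors."""
    score = bias
    for sv, c in zip(support_vectors, coeffs):
        score += c * sum(s * xi for s, xi in zip(sv, x))
    return score

def collapse_to_primal(support_vectors, coeffs):
    """Fold the expansion into one weight vector w = sum_i coeffs[i] * sv_i.
    Valid only when the feature map is explicit (here: the linear kernel)."""
    dim = len(support_vectors[0])
    w = [0.0] * dim
    for sv, c in zip(support_vectors, coeffs):
        for j in range(dim):
            w[j] += c * sv[j]
    return w

def primal_predict(w, bias, x):
    """Primal-form score: bias + <w, x>. The work depends only on the dimension."""
    return bias + sum(wj * xj for wj, xj in zip(w, x))

# Made-up data: coeffs stand in for alpha_i * y_i of a trained SVM.
rng = random.Random(0)
dim, n_sv = 5, 200
svs = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_sv)]
coeffs = [rng.uniform(-1, 1) for _ in range(n_sv)]
bias = 0.1
x = [rng.uniform(-1, 1) for _ in range(dim)]

w = collapse_to_primal(svs, coeffs)
dual_score = kernel_predict(svs, coeffs, bias, x)   # 200 inner products
primal_score = primal_predict(w, bias, x)           # 1 inner product
```

Both scores agree up to floating-point rounding, but the primal form needs a single inner product per example regardless of how many support vectors the model has.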

Similar resources

Efficient Training of Graph-Regularized Multitask SVMs

We present an optimization framework for graph-regularized multi-task SVMs based on the primal formulation of the problem. Previous approaches employ a so-called multi-task kernel (MTK) and thus are inapplicable when the number of training examples n is large (typically n < 20,000, even for just a few tasks). In this paper, we present a primal optimization criterion, allowing for general loss...

Genetic Programming for Kernel-Based Learning with Co-evolving Subsets Selection

Support Vector Machines (SVMs) are well-established Machine Learning (ML) algorithms. They rely on the fact that i) linear learning can be formalized as a well-posed optimization problem; ii) non-linear learning can be brought into linear learning thanks to the kernel trick and the mapping of the initial search space onto a high-dimensional feature space. The kernel is designed by the ML exper...

APPLICATION OF KRIGING METHOD IN SURROGATE MANAGEMENT FRAMEWORK FOR OPTIMIZATION PROBLEMS

In this paper, Kriging has been chosen as the method for surrogate construction. The basic idea behind Kriging is to use a weighted linear combination of known function values to predict a function value at a place where it is not known. Kriging attempts to determine the best combination of weights in order to minimize the error in the estimated function value. Because the actual function value...

An Efficient Classifier Based on Hierarchical Mixing Linear Support Vector Machines

Support vector machines (SVMs) play a very dominant role in data classification due to their good generalization performance. However, they suffer from the high computational complexity in the classification phase when there are a considerable number of support vectors (SVs). Then it is desirable to design efficient algorithms in the classification phase to deal with the datasets of realtime pa...

Scalable, accurate image annotation with joint SVMs and output kernels

This paper studies how joint training of multiple support vector machines (SVMs) can improve the effectiveness and efficiency of automatic image annotation. We cast image annotation as an output-related multi-task learning framework, with the prediction of each tag’s presence as one individual task. Evidently, these tasks are related via dependencies between tags. The proposed joint learning fr...

Publication date: 2010